% LaTeX source for textbook ``How to think like a computer scientist''
% Copyright (c)  2001  Allen B. Downey, Jeffrey Elkner, and Chris Meyers.

% Permission is granted to copy, distribute and/or modify this
% document under the terms of the GNU Free Documentation License,
% Version 1.1  or any later version published by the Free Software
% Foundation; with the Invariant Sections being "Contributor List",
% with no Front-Cover Texts, and with no Back-Cover Texts. A copy of
% the license is included in the section entitled "GNU Free
% Documentation License".

% This distribution includes a file named fdl.tex that contains the text
% of the GNU Free Documentation License.  If it is missing, you can obtain
% it from www.gnu.org or by writing to the Free Software Foundation,
% Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
\chapter{Variables, expressions and statements}

\section{Values and types}
\index{value}
\index{type}
\index{string}

A {\bf value} is one of the fundamental things---like a letter or a
number---that a program manipulates.  The values we have seen so far
are {\tt 2} (the result when we added {\tt 1 + 1}) and
{\tt "Hello, World!"}.

These values belong to different {\bf types}:
{\tt 2} is an integer, and {\tt "Hello, World!"} is a {\bf string},
so-called because it contains a ``string'' of letters.
You (and the interpreter) can identify
strings because they are enclosed in quotation marks.

The print statement also works for integers.

\beforeverb
\begin{verbatim}
>>> print 4
4
\end{verbatim}
\afterverb
%
If you are not sure what type a value has,
the interpreter can tell you.

\beforeverb
\begin{verbatim}
>>> type("Hello, World!")
<type 'str'>
>>> type(17)
<type 'int'>
\end{verbatim}
\afterverb
%
Not surprisingly, strings belong to the type {\tt str} and
integers belong to the type {\tt int}.  Less obviously, numbers
with a decimal point belong to a type called {\tt float},
because these numbers are represented in a
format called {\bf floating-point}.

\index{type}
\index{string}
\index{type!str}
\index{int}
\index{type!int}
\index{float}
\index{type!float}

\beforeverb
\begin{verbatim}
>>> type(3.2)
<type 'float'>
\end{verbatim}
\afterverb
%
What about values like {\tt "17"} and {\tt "3.2"}?
They look like numbers, but they are in quotation marks like
strings.

\beforeverb
\begin{verbatim}
>>> type("17")
<type 'str'>
>>> type("3.2")
<type 'str'>
\end{verbatim}
\afterverb
%
They're strings.

When you type a large integer, you might be tempted to use commas
between groups of three digits, as in {\tt 1,000,000}.  This is not a
legal integer in Python, but it is a legal expression:

\beforeverb
\begin{verbatim}
>>> print 1,000,000
1 0 0
\end{verbatim}
\afterverb
%
Well, that's not what we expected at all!  Python interprets {\tt
1,000,000} as a comma-separated list of three integers ({\tt 1},
{\tt 000}, and {\tt 000}), which it prints one after another, separated
by spaces.  This is the first example we have seen of a
semantic error: the code runs without producing an error message, but
it doesn't do the ``right'' thing.
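
To see where the three values come from, it can help to store the
comma-separated expression in a variable and inspect it.  The following
is only a sketch and is not part of the original example: the names
{\tt million} and {\tt not\_a\_million} are made up for illustration,
and it uses {\tt len} and square brackets, which have not been
introduced yet.  The argument of {\tt print} is written in parentheses
so that the code should also run under later versions of Python, where
{\tt print} is a function.

```python
# Intended value: one million, written without commas.
million = 1000000

# With commas, Python sees three separate integers: 1, 000 and 000.
# (The name not_a_million is hypothetical, chosen for this example.)
not_a_million = 1,000,000

print(million)             # 1000000
print(len(not_a_million))  # 3
print(not_a_million[0])    # 1
```

The first variable holds a single integer, while the second holds a
group of three values, which is why the print statement displays three
numbers.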


\section{Variables}
\index{variable}
\index{assignment}
\index{statement!assignment}

One of the most powerful features of a programming language is the
ability to manipulate {\bf variables}.  A variable is a name that
refers to a value.

The {\bf assignment statement} creates new variables and gives
them values:

\beforeverb
\begin{verbatim}
>>> message = "What's up, Doc?"
>>> n = 17
>>> pi = 3.14159
\end{verbatim}
\afterverb
%
This example makes three assignments.  The first assigns the string
{\tt "What's up, Doc?"} to a new variable named {\tt message}.
The second gives the integer {\tt 17} to {\tt n}, and the third
gives the floating-point number {\tt 3.14159} to {\tt pi}.

\index{state diagram}

A common way to represent variables on paper is to write the name with
an arrow pointing to the variable's value.  This kind of figure is
called a {\bf state diagram} because it shows what state each of the
variables is in (think of it as the variable's state of mind).
This diagram shows the result of the assignment statements:

\beforefig
\centerline{\psfig{figure=illustrations/state2.eps}}
\afterfig

The print statement also works with variables.

\beforeverb
\begin{verbatim}
>>> print message
What's up, Doc?
>>> print n
17
>>> print pi
3.14159
\end{verbatim}
\afterverb
%
In each case the result is the value of the variable.
Variables also have types; again, we can ask the
interpreter what they are.

\beforeverb
\begin{verbatim}
>>> type(message)
<type 'str'>
>>> type(n)
<type 'int'>
>>> type(pi)
<type 'float'>
\end{verbatim}
\afterverb
%
The type of a variable is the type of the value it
refers to.
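
One consequence is that the same name can refer to values of different
types at different times.  The sketch below illustrates this under a
few assumptions: the names {\tt first\_type} and {\tt second\_type} are
made up for the example, and the attribute {\tt \_\_name\_\_} (which
gives the name of a type as a string) is not covered in this chapter.
It is written so that it should run under both the version of Python
used here and later versions.

```python
n = 17
first_type = type(n).__name__    # name of the type n refers to now

n = "seventeen"                  # the same name now refers to a string
second_type = type(n).__name__

print(first_type)    # int
print(second_type)   # str
```

The variable {\tt n} itself has not changed; only the value it refers
to has, and the type follows the value.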


\section{Variable names and keywords}
\index{keyword}

Programmers generally choose names for their variables that
are meaningful---they document what the variable is used for.

Variable names can be arbitrarily long.  They can contain
both letters and digits, but they cannot begin with a digit.
Although it is legal to use uppercase letters, by convention
we don't.  If you do, remember that case matters.  {\tt Bruce}
and {\tt bruce} are different variables.

The underscore character ({\tt \_}) can appear in a name.
It is often used in names with multiple words, such as
{\tt my\_name} or {\tt price\_of\_tea\_in\_china}.

\index{underscore character}

If you give a variable an illegal name, you get a syntax error:

\adjustpage{-2}
\pagebreak
\beforeverb
\begin{verbatim}
>>> 76trombones = "big parade"
SyntaxError: invalid syntax
>>> more$ = 1000000
SyntaxError: invalid syntax
>>> class = "Computer Science 101"
SyntaxError: invalid syntax
\end{verbatim}
\afterverb
%
{\tt 76trombones} is illegal because it begins with a digit.
{\tt more\$} is illegal because it contains an illegal character, the dollar
sign.  But what's wrong with {\tt class}?

It turns out that {\tt class} is one of the Python {\bf keywords}.
Keywords define the language's rules and structure, and they cannot be
used as variable names.

\index{keyword}

The version of Python described here has twenty-nine keywords (the
list varies slightly from one version to another):

\beforeverb
\begin{verbatim}
and       def       exec      if        not       return
assert    del       finally   import    or        try
break     elif      for       in        pass      while
class     else      from      is        print     yield
continue  except    global    lambda    raise
\end{verbatim}
\afterverb
%
You might want to keep this list handy.  If the interpreter complains
about one of your variable names and you don't know why, see if it
is on this list.
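
You can also ask Python itself.  The standard library includes a module
named {\tt keyword} whose function {\tt iskeyword} reports whether a
string is a keyword in the version of Python that is running.  The
following is a minimal sketch: the list {\tt candidates} and the name
{\tt reserved} are made up for the example, and it uses {\tt import},
{\tt for}, and {\tt if}, which are explained in later chapters.

```python
import keyword

# Candidate names taken from the examples in this section.
candidates = ["message", "class", "my_name", "while"]

# Collect the candidates that the running interpreter treats as keywords.
reserved = []
for name in candidates:
    if keyword.iskeyword(name):
        reserved.append(name)
        print(name + " is a keyword")
    else:
        print(name + " is not a keyword")
```

Because the set of keywords differs between versions, a name such as
{\tt print} may be reported as a keyword by one version of Python and
not by another.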


\section{Statements}

A {\bf statement} is an instruction that the Python interpreter can
execute.  We have seen two kinds of statements: print
and assignment.

When you type a statement on the command line, Python
executes it and displays the result, if there is one.  A print
statement displays the value it is given.  Assignment statements
don't produce a result.

A script usually contains a sequence of statements.  If there
is more than one statement, the results appear one at a time
as the statements execute.

For example, the script

\beforeverb
\begin{verbatim}
print 1
x = 2
print x
\end{verbatim}
\afterverb
%
produces the output

\beforeverb
\begin{verbatim}
1
2
\end{verbatim}
\afterverb
%
Again, the assignment statement produces no output.



\section{Evaluating expressions}

An {\bf expression} is a combination of values, variables, and operators.
If you type an expression on the command line, the interpreter
{\bf evaluates} it and displays the result:

\beforeverb
\begin{verbatim}
>>> 1 + 1
2
\end{verbatim}
\afterverb
%
Although expressions contain values, variables, and operators,
not every expression contains all of these elements.
A value all by itself is considered an expression, and so is
a variable.

\beforeverb
\begin{verbatim}
>>> 17
17
>>> x
2
\end{verbatim}
\afterverb
%
Confusingly, evaluating an expression is not quite the same
thing as printing a value.

\beforeverb
\begin{verbatim}
>>> message = "What's up, Doc?"
>>> message
"What's up, Doc?"
>>> print message
What's up, Doc?
\end{verbatim}
\afterverb
%
When the Python interpreter displays the value of an expression, it
uses the same format you would use to enter a value.  In the case of
strings, that means that it includes the quotation marks.  But if
you use a print statement, Python displays the contents of the string
without the quotation marks.
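
The two formats can be produced explicitly with the built-in functions
{\tt repr} and {\tt str}, which are not otherwise covered in this
chapter.  In the sketch below, the names {\tt as\_expression} and
{\tt as\_printed} are made up for the example.

```python
message = "What's up, Doc?"

# repr() returns the form the interpreter displays for an expression;
# str() returns the form that print displays.
as_expression = repr(message)
as_printed = str(message)

print(as_expression)   # "What's up, Doc?"
print(as_printed)      # What's up, Doc?
```

The first line of output includes the quotation marks, and the second
does not.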

In a script, an expression all by itself is a legal statement, but it
doesn't do anything.  The script

\beforeverb
\begin{verbatim}
17
3.2
"Hello, World!"
1 + 1
\end{verbatim}
\afterverb
%
produces no output at all.  How would you change the script to
display the values of these four expressions?


\section{Operators and operands}
\index{operator}
\index{operand}
\index{expression}

{\bf Operators} are special symbols that represent computations
like addition and multiplication.  The values the operator uses are
called {\bf operands}.

The following are all legal Python expressions whose meaning is more or
less clear:
\adjustpage{2}
\beforeverb
\begin{verbatim}
20+32   hour-1   hour*60+minute   minute/60   5**2   (5+9)*(15-7)
\end{verbatim}
\afterverb
%
The symbols {\tt +}, {\tt -}, and {\tt /}, and the use of parentheses for
grouping, mean in Python what they mean in mathematics.  The asterisk
({\tt *}) is the symbol for multiplication, and {\tt **} is the symbol
for exponentiation.

When a variable name appears in the place of an operand, it
is replaced with its value before the operation is
performed.

Addition, subtraction, multiplication, and exponentiation all do what
you expect, but you might be surprised by division.  The following
operation has an unexpected result:

\beforeverb
\begin{verbatim}
>>> minute = 59
>>> minute/60
0
\end{verbatim}
\afterverb
%
The value of {\tt minute} is 59, and in conventional arithmetic 59
divided by 60 is 0.98333, not 0.  The reason for the discrepancy is
that Python is performing {\bf integer division}.

\index{integer division}

When both of the operands are integers, the result must also be an integer,
and by convention, integer division always rounds {\em down}, even in cases
like this where the next integer is very close.

A possible solution to this problem is to calculate a percentage
rather than a fraction:

\beforeverb
\begin{verbatim}
>>> minute*100/60
98
\end{verbatim}
\afterverb
%
Again the result is rounded down, but at least now the answer is
approximately correct.  Another alternative is to use floating-point
division, which we get to in Chapter~\ref{floatchap}.
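
The three approaches can be compared side by side.  The following
sketch makes two assumptions that go beyond the text so far: it uses
the {\tt //} operator, described later in this chapter, so that the
integer division is explicit, and it writes {\tt 60.0} to get a
floating-point operand.  The names {\tt fraction}, {\tt percentage},
and {\tt exact} are made up for the example.

```python
minute = 59

fraction = minute // 60            # integer division rounds down: 0
percentage = minute * 100 // 60    # multiply first, then divide: 98
exact = minute / 60.0              # floating-point operand: about 0.9833

print(fraction)
print(percentage)
print(exact)
```

Multiplying before dividing keeps the whole computation in integers
while losing less information to rounding.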


\section{Order of operations}
\index{order of operations}
\index{rules of precedence}

When more than one operator appears in an expression, the order of
evaluation depends on the {\bf rules of precedence}.  Python follows
the same precedence rules for its mathematical operators that
mathematics does.  The acronym {\bf PEMDAS} is a useful way to
remember the order of operations:

\begin{itemize}

\item {\bf P}arentheses have the highest precedence and can be used 
to force an expression to evaluate in the order you want. Since
expressions in parentheses are evaluated first, {\tt 2 * (3-1)} is 4,
and {\tt (1+1)**(5-2)} is 8. You can also use parentheses to make an
expression easier to read, as in {\tt (minute * 100) / 60}, even
though it doesn't change the result.

\item {\bf E}xponentiation has the next highest precedence, so
{\tt 2**1+1} is 3 and not 4, and {\tt 3*1**3} is 3 and not 27.

\item {\bf M}ultiplication and {\bf D}ivision have the same precedence,
which is higher than {\bf A}ddition and {\bf S}ubtraction, which also
have the same precedence.  So {\tt 2*3-1} yields 5 rather than 4, and
{\tt 2/3-1} is {\tt -1}, not {\tt 1} (remember that in integer
division, {\tt 2/3} is {\tt 0}).

\item Operators with the same precedence are evaluated from left to 
right (exponentiation, which is evaluated from right to left, is the
exception).  So in the expression {\tt minute*100/60}, the multiplication
happens first, yielding {\tt 5900/60}, which in turn yields {\tt 98}.
If the operations had been evaluated from right to left, the result
would have been {\tt 59*1}, which is {\tt 59}, the wrong answer.

\end{itemize}
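
The rules above can be checked directly.  The sketch below repeats
several of the expressions from the list; the variable names are made
up for the example, and {\tt //} (described later in this chapter) is
used in place of {\tt /} so that the division is integer division in
every version of Python.

```python
minute = 59

grouped = 2 * (3 - 1)                  # parentheses first: 4
power_first = 2 ** 1 + 1               # exponentiation before addition: 3
times_first = 2 * 3 - 1                # multiplication before subtraction: 5
divide_first = 2 // 3 - 1              # division before subtraction: -1
left_to_right = minute * 100 // 60     # (minute * 100) // 60 is 98
right_to_left = minute * (100 // 60)   # forcing the other order gives 59

print(grouped)
print(power_first)
print(times_first)
print(divide_first)
print(left_to_right)
print(right_to_left)
```

The last two lines differ only in the order of evaluation, which is
enough to change the answer from 98 to 59.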


\section{Operations on strings}
\index{string operation}

In general, you cannot perform mathematical operations on strings, even
if the strings look like numbers.  The following are illegal (assuming
that {\tt message} has type {\tt str}):

\beforeverb
\begin{verbatim}
 message-1   "Hello"/123   message*"Hello"   "15"+2
\end{verbatim}
\afterverb
%
Interestingly, the {\tt +} operator does work with strings, although it
does not do exactly what you might expect.  For strings, the {\tt +} operator
represents {\bf concatenation}, which means joining the two operands by
linking them end-to-end.  For example:

\index{concatenation}

\beforeverb
\begin{verbatim}
fruit = "banana"
baked_good = " nut bread"
print fruit + baked_good
\end{verbatim}
\afterverb
%
The output of this program is {\tt banana nut bread}.  The space
before the word {\tt nut} is part of the string, and is necessary
to produce the space between the concatenated strings.

The {\tt *} operator also works on strings; it performs repetition.
For example, {\tt "Fun"*3} is {\tt "FunFunFun"}.  One of the operands
has to be a string; the other has to be an integer.
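
The following sketch puts the legal and illegal cases together.  It is
illustrative only: the names {\tt loaf}, {\tt cheer}, and
{\tt mixing\_failed} are made up for the example, and it uses
{\tt try} and {\tt except}, which are not covered in this chapter, to
catch the error instead of stopping the program.

```python
fruit = "banana"
baked_good = " nut bread"

loaf = fruit + baked_good      # concatenation
cheer = "Fun" * 3              # repetition

print(loaf)     # banana nut bread
print(cheer)    # FunFunFun

# Mixing a string and an integer with + is an error, even if the
# string looks like a number.
mixing_failed = False
try:
    total = "15" + 2
except TypeError:
    mixing_failed = True

print(mixing_failed)   # True
```

The error in the last case is a {\tt TypeError}: Python does not guess
whether {\tt "15"+2} was meant as addition or as concatenation.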

On one hand, this interpretation of {\tt +} and {\tt *} makes sense by
analogy with addition and multiplication.  Just as {\tt 4*3} is
equivalent to {\tt 4+4+4}, we expect {\tt "Fun"*3} to be the same as
{\tt "Fun"+"Fun"+"Fun"}, and it is.  On the other hand, there is a
significant way in which string concatenation and repetition are
different from integer addition and multiplication.
Can you think of a property that addition and multiplication have
that string concatenation and repetition do not?


\section{Composition}
\index{composition}

So far, we have looked at the elements of a program---variables,
expressions, and statements---in isolation, without talking about how to
combine them.

One of the most useful features of programming languages is their
ability to take small building blocks and {\bf compose} them.  For
example, we know how to add numbers and we know how to print; it turns
out we can do both at the same time:

\beforeverb
\begin{verbatim}
>>> print 17 + 3
20
\end{verbatim}
\afterverb
%
In reality, the
addition has to happen before the printing, so the actions aren't 
actually happening at the same time. The point is that any
expression involving numbers, strings, and variables can be used inside a
print statement.  You've already seen an example of this:

\beforeverb
\begin{verbatim}
print "Number of minutes since midnight: ", hour*60+minute
\end{verbatim}
\afterverb
%
You can also put arbitrary expressions on the right-hand side of an
assignment statement:

\beforeverb
\begin{verbatim}
percentage = (minute * 100) / 60
\end{verbatim}
\afterverb
%
This ability may not seem impressive now, but you will see other examples
where composition makes it possible to express complex computations neatly and
concisely.

Warning: There are limits on where you can use certain expressions.  For
example, the left-hand side of an assignment statement has to be a
{\em variable} name, not an expression.  So, the following is illegal:
{\tt minute+1 = hour}.
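
One way to confirm that such a statement is rejected before it ever
runs is to ask Python to compile it without executing it.  The sketch
below is an assumption-laden illustration rather than something you
need at this stage: {\tt is\_legal} is a hypothetical helper written
for this example, and it relies on function definitions, the built-in
{\tt compile} function, and {\tt try}/{\tt except}, none of which have
been introduced yet.

```python
def is_legal(source):
    """Return True if source compiles as Python statements."""
    try:
        # Compile only; the statement is never executed, so the
        # variables it mentions do not need to exist.
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False

print(is_legal("percentage = (minute * 100) // 60"))  # True
print(is_legal("minute + 1 = hour"))                  # False
```

The first statement has a variable name on the left-hand side and an
expression on the right; the second has them the wrong way around.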


\section{Comments}
\index{comment}

As programs get bigger and more complicated, they get more difficult to
read.  Formal languages are dense, and it is often difficult to look
at a piece of
code and figure out what it is doing, or why.

For this reason, it is a good idea to add notes to your programs to explain
in natural language what the program is doing.  These notes are called
{\bf comments}, and they are marked with the {\tt \#} symbol:

\beforeverb
\begin{verbatim}
# compute the percentage of the hour that has elapsed
percentage = (minute * 100) / 60
\end{verbatim}
\afterverb
%
In this case, the comment appears on a line by itself.  You can also put
comments at the end of a line:

\beforeverb
\begin{verbatim}
percentage = (minute * 100) / 60     # caution: integer division
\end{verbatim}
\afterverb
%
Everything from the {\tt \#} to the end of the line is ignored---it
has no effect on the program.  The message is intended for the programmer or
for future programmers who might use this code.  In this case, it
reminds the reader about the ever-surprising behavior of integer division.

\index{integer division}

This sort of comment is less necessary if you use the integer division
operator, \verb+//+.  When both operands are integers, it has the same
effect as the division
operator\footnote{For now.  The behavior of the division operator may
change in future versions of Python.}, but it signals that the effect
is deliberate.

\beforeverb
\begin{verbatim}
percentage = (minute * 100) // 60 
\end{verbatim}
\afterverb
%
The integer division operator is like a comment that says, ``I know
this is integer division, and I like it that way!''

\section{Glossary}

\begin{description}

\item[value:]  A number or string (or other thing to be named later)
that can be stored in a variable or computed in an expression.

\item[type:]  A set of values.  The type of a value determines how
it can be used in expressions.  So far, the types you have seen are integers
(type {\tt int}), floating-point numbers (type {\tt float}),
and strings (type {\tt str}).

\item[floating-point:] A format for representing numbers with fractional
parts.

\item[variable:]  A name that refers to a value.

\item[statement:]  A section of code that represents a command or action.  So
far, the statements you have seen are assignments and print statements.

\item[assignment:]  A statement that assigns a value to a variable.

\item[state diagram:]  A graphical representation of a set of variables and the
values to which they refer.

\item[keyword:]  A reserved word that is used by the compiler to parse a
program; you cannot use keywords like {\tt if}, {\tt  def}, and {\tt while} as
variable names.

\item[operator:]  A special symbol that represents a simple computation like
addition, multiplication, or string concatenation.

\item[operand:]  One of the values on which an operator operates.

\item[expression:]  A combination of variables, operators, and values that
represents a single result value.

\item[evaluate:]  To simplify an expression by performing the operations
in order to yield a single value.

\item[integer division:]  An operation that divides one integer by
another and yields an integer.  Integer division yields only the
whole number of times that the numerator is divisible by the
denominator and discards any remainder.

\item[rules of precedence:]  The set of rules governing the order in which
expressions involving multiple operators and operands are evaluated.

\item[concatenate:]  To join two operands end-to-end.

\item[composition:]  The ability to combine simple expressions and statements
into compound statements and expressions in order to represent complex
computations concisely.

\item[comment:]  Information in a program that is meant for other
programmers (or anyone reading the source code) and has no effect on the
execution of the program.

\index{value}
\index{floating-point}
\index{variable}
\index{type}
\index{keyword}
\index{statement}
\index{assignment}
\index{comment}
\index{state diagram}
\index{expression}
\index{operator}
\index{operand}
\index{integer division}
\index{rules of precedence}
\index{precedence}
\index{concatenation}
\index{composition}

\end{description}
